Add multi-instance SDK support via delegating providers #177
JacksonWeber wants to merge 6 commits into
Introduce createMicrosoftOpenTelemetryInstance to run multiple isolated SDK instances in one Node.js runtime. Parent (delegating) Tracer/Meter/Logger providers route per-call to the current child instance, resolved via an AsyncLocalStorage-backed ambient context (runWithInstance) with a default fallback. Each instance owns its own resource, sampler, processors, readers, and exporters. Additive and opt-in; the existing useMicrosoftOpenTelemetry single-instance path is unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Adds an opt-in multi-instance Microsoft OpenTelemetry SDK mode that allows multiple isolated telemetry pipelines to coexist in a single Node.js process by registering global delegating providers that route per-call to an “ambient” (AsyncLocalStorage-bound) instance, falling back to a default instance.
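The routing rule in this overview (ambient instance first, default instance otherwise) can be sketched with nothing but Node's `AsyncLocalStorage`. This is a minimal model, not the PR's code: `Instance`, `register`, `runWithInstance`, `resolveCurrent`, and `delegatingTracer` are hypothetical names, and a span is reduced to a string.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical stand-in for one isolated pipeline; a "span" is just a string here.
interface Instance {
  id: string;
  spans: string[];
}

const registry = new Map<string, Instance>();
const ambientId = new AsyncLocalStorage<string>();
let defaultId: string | undefined;

function register(id: string, makeDefault = false): Instance {
  const instance: Instance = { id, spans: [] };
  registry.set(id, instance);
  if (makeDefault) defaultId = id;
  return instance;
}

// Bind `id` as the ambient instance for the duration of `fn`.
// Unknown ids are not bound, so resolution falls back to the default.
function runWithInstance<T>(id: string, fn: () => T): T {
  return registry.has(id) ? ambientId.run(id, fn) : fn();
}

// Resolved on every call, never cached: ambient instance first, then the default.
function resolveCurrent(): Instance | undefined {
  const id = ambientId.getStore() ?? defaultId;
  return id === undefined ? undefined : registry.get(id);
}

// A delegating "tracer" forwards each call to whichever instance is current.
const delegatingTracer = {
  startSpan(name: string): void {
    resolveCurrent()?.spans.push(name);
  },
};

const a = register("a", true);
const b = register("b");

delegatingTracer.startSpan("outside"); // default instance "a"
runWithInstance("b", () => delegatingTracer.startSpan("inside")); // ambient instance "b"
runWithInstance("gone", () => delegatingTracer.startSpan("stale")); // unknown id, falls back to "a"

console.log(a.spans, b.spans);
```

Because the store is carried by `AsyncLocalStorage`, the binding also follows `await`s and callbacks started inside `fn`, which is what makes a per-call lookup workable for asynchronous request handling.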
Changes:
- Introduces `createMicrosoftOpenTelemetryInstance()` / `runWithMicrosoftOpenTelemetryInstance()` and the `MicrosoftOpenTelemetryInstance` handle type.
- Implements multi-instance runtime: instance registry + id binding, one-time global setup (context manager/propagator + parent providers), and delegating tracer/meter/logger providers.
- Adds functional coverage for isolation and ambient routing; promotes `@opentelemetry/context-async-hooks` to a direct dependency.
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| test/internal/functional/multiInstance.test.ts | New functional tests validating per-instance isolation and ambient routing via global API |
| src/types.ts | Adds MicrosoftOpenTelemetryInstance public interface type |
| src/index.ts | Re-exports new multi-instance APIs/types from the package entrypoint |
| src/distro/multiInstance/instanceRegistry.ts | Registry + AsyncLocalStorage-backed “current instance” binding and resolution |
| src/distro/multiInstance/instance.ts | Builds per-instance child providers/pipelines and lifecycle methods (flush/shutdown) |
| src/distro/multiInstance/index.ts | Exposes multi-instance APIs from the distro layer |
| src/distro/multiInstance/globalSetup.ts | One-time process-global setup for context manager/propagator + parent providers |
| src/distro/multiInstance/delegatingProviders.ts | Parent providers + delegating tracer/meter/logger implementations |
| src/distro/index.ts | Surfaces multi-instance exports through src/distro |
| package.json | Adds @opentelemetry/context-async-hooks direct dependency |
| package-lock.json | Locks the new dependency for npm ci installs |
- withInstance: skip binding unknown/stale ids so resolution falls back to the default instance instead of producing silent no-op telemetry.
- instance.shutdown: wrap disposers via Promise.resolve().then so a synchronous throw is captured and does not abort the rest of shutdown.
- multiInstance test: also disable the logs provider and the global context manager in afterEach to prevent cross-test contamination.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
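The `instance.shutdown` item relies on a small Promise idiom: invoking a function inside `.then()` converts a synchronous throw into a rejection, so one failing disposer cannot prevent the others from running. Below is a rough sketch of that idea, assuming the disposers are awaited together; `Disposer` and `shutdownAll` are illustrative names, not the PR's.

```typescript
type Disposer = () => void | Promise<void>;

// Runs every disposer and returns the failures instead of stopping at the first one.
async function shutdownAll(disposers: Disposer[]): Promise<unknown[]> {
  const errors: unknown[] = [];
  await Promise.all(
    disposers.map((dispose) =>
      // Calling `dispose` inside .then() turns a synchronous throw into a rejection,
      // which the .catch() below records without affecting the other disposers.
      Promise.resolve()
        .then(dispose)
        .catch((err: unknown) => {
          errors.push(err);
        }),
    ),
  );
  return errors;
}

const calls: string[] = [];
const done = shutdownAll([
  () => {
    calls.push("tracer");
  },
  () => {
    throw new Error("sync failure"); // would abort a plain `disposers.map((d) => d())`
  },
  async () => {
    calls.push("meter");
  },
]);

done.then((errors) => console.log(calls, errors.length));
```

Mapping with `(d) => d()` directly would let the second disposer's throw escape from `map` before any promise is awaited, which is the failure mode the commit describes.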
rads-1996 left a comment
Is it possible to break the PR into small chunks, so it is easier to review? Or does it make logical sense to have everything together?
I don't believe that will work well for this PR, as it implements a single self-contained feature. If any one component is split off, it won't function. The only way I can think of to split it is per signal, but I'm not sure that would make it much easier to review.
Each createMicrosoftOpenTelemetryInstance() now builds its own OpenTelemetry instrumentation set from its instrumentationOptions and binds it directly to that instance's providers via registerInstrumentations, so different exporters (e.g. Azure Monitor vs. A365) can run different instrumentations and settings in the same process. Adds A365 export support and per-instance A365 GenAI instrumentation defaults, tracks per-instance registration in SDKStats, and shares _applyA365InstrumentationDefaults between the single- and multi-instance paths. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
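To illustrate why binding instrumentations directly to an instance's providers isolates their output, here is a toy model. In real OpenTelemetry the binding is done by passing `tracerProvider`/`meterProvider`/`loggerProvider` to `registerInstrumentations`; the `Provider` and `Instrumentation` classes, `createInstance`, and the `{ enabled }` option shape below are simplified assumptions made only for this sketch.

```typescript
// Hypothetical stand-ins; these are not the real OpenTelemetry classes.
class Provider {
  readonly spans: string[] = [];
  constructor(readonly name: string) {}
}

class Instrumentation {
  constructor(readonly library: string, private readonly provider: Provider) {}
  // Called by the (imaginary) patched library on each operation.
  record(operation: string): void {
    this.provider.spans.push(`${this.library}:${operation}`);
  }
}

// Assumed option shape: one { enabled } flag per built-in instrumentation.
type InstrumentationOptions = Record<string, { enabled: boolean }>;

function createInstance(name: string, options: InstrumentationOptions) {
  const provider = new Provider(name);
  // Build this instance's own instrumentation objects and bind them to its provider.
  const instrumentations = Object.keys(options)
    .filter((library) => options[library].enabled)
    .map((library) => new Instrumentation(library, provider));
  return { provider, instrumentations };
}

const azureMonitor = createInstance("azure-monitor", {
  http: { enabled: true },
  redis: { enabled: true },
});
const a365 = createInstance("a365", {
  http: { enabled: false },
  genai: { enabled: true },
});

// Every enabled instrumentation observes the same process-wide activity...
for (const instance of [azureMonitor, a365]) {
  for (const instrumentation of instance.instrumentations) {
    instrumentation.record("call");
  }
}

// ...but each one reports only to the provider it was bound to.
console.log(azureMonitor.provider.spans, a365.provider.spans);
```

No ambient lookup is involved here: the destination is fixed when the instrumentation is constructed, which is the property the commit message attributes to per-instance instrumentation sets.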
Resolve dep version conflicts (adopt main's 0.219.0/2.8.0, keep context-async-hooks) and adapt to api-logs createNoopLogger. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…instance DelegatingMeter previously resolved the instance at instrument creation time, pinning .add()/.record() to whichever instance was current then (usually the default). Synchronous instruments now re-resolve the current instance on every measurement so metrics follow runWithInstance like traces/logs. Observable instruments remain bound at creation (collected async, outside any scope). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
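The difference between resolving at instrument creation and resolving per measurement can be shown with a small model. This is a sketch under stated assumptions, not the PR's `DelegatingMeter`: `InstanceMeter`, `createPinnedCounter`, and `createDelegatingCounter` are hypothetical names, and caching one underlying counter per meter is a choice made for this example only.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

interface Counter {
  add(value: number): void;
}

// Hypothetical per-instance meter that just sums values by instrument name.
class InstanceMeter {
  readonly totals = new Map<string, number>();
  createCounter(name: string): Counter {
    return {
      add: (value: number): void => {
        this.totals.set(name, (this.totals.get(name) ?? 0) + value);
      },
    };
  }
}

const defaultMeter = new InstanceMeter();
const ambient = new AsyncLocalStorage<InstanceMeter>();
const resolveMeter = (): InstanceMeter => ambient.getStore() ?? defaultMeter;

// Previous behavior: the target meter is chosen once, when the instrument is created.
function createPinnedCounter(name: string): Counter {
  return resolveMeter().createCounter(name);
}

// New behavior: the target meter is re-resolved on every measurement.
// Caching one real counter per meter is an assumption of this sketch.
function createDelegatingCounter(name: string): Counter {
  const perMeter = new WeakMap<InstanceMeter, Counter>();
  return {
    add(value: number): void {
      const meter = resolveMeter();
      let counter = perMeter.get(meter);
      if (counter === undefined) {
        counter = meter.createCounter(name);
        perMeter.set(meter, counter);
      }
      counter.add(value);
    },
  };
}

const tenantMeter = new InstanceMeter();
const pinned = createPinnedCounter("pinned.requests"); // created outside any scope
const delegating = createDelegatingCounter("delegating.requests");

ambient.run(tenantMeter, () => {
  pinned.add(1); // still recorded on defaultMeter
  delegating.add(1); // recorded on tenantMeter
});

console.log(defaultMeter.totals, tenantMeter.totals);
```

Observable instruments do not fit this pattern because their callbacks run during collection, outside any scope, which matches the commit's note that they stay bound at creation.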
…g test Add afterAll cleanup (cm.disable() + context.disable()) so the AsyncLocalStorageContextManager installed in beforeAll does not leak into other tests in the shared Vitest worker. Also document the multi-instance public API in the README. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Adds opt-in support for running multiple isolated Microsoft OpenTelemetry SDK instances in a single Node.js runtime. Today `useMicrosoftOpenTelemetry()` registers global providers per signal, so a second initialization clobbers the first. This PR introduces a parent/child delegating-provider architecture so independent pipelines can coexist and, crucially, each instance owns its own set of OpenTelemetry instrumentations, settings, and sampling, bound directly to that instance's providers.
The prioritized scenario this enables: a customer registers different instrumentations for the Azure Monitor exporter than for the A365 (or OTLP) exporter in the same process (e.g. full HTTP/SQL/Redis to Azure Monitor, GenAI-only to A365).
GA constraint respected: all changes are additive and non-breaking. The existing single-instance `useMicrosoftOpenTelemetry()` path is untouched.
What's new
- `createMicrosoftOpenTelemetryInstance(options, { makeDefault? })` returns a `MicrosoftOpenTelemetryInstance` handle (`getTracer`/`getMeter`/`getLogger`, `runWithInstance`, `forceFlush`, `shutdown`). Each instance builds a standalone pipeline plus its own instrumentations, sampler, and exporter.
- `runWithMicrosoftOpenTelemetryInstance(id, fn)` binds an ambient "current instance" so code using the global OTel API routes to the right pipeline.

Design
`src/distro/multiInstance/`:
- An instance registry with an AsyncLocalStorage-backed "current instance".
- Each instance builds its own child `NodeTracerProvider`/`MeterProvider`/`LoggerProvider` (no `NodeSDK.start()`, no global registration), wiring Azure Monitor handlers, A365 export, caller processors, and a console fallback. It then builds this instance's instrumentations from its own `instrumentationOptions` and binds them to the child providers via `registerInstrumentations`, before registering the instance.

Per-instance instrumentations (the key mechanism). Because each instrumentation is bound directly to its instance's providers, the spans/metrics/logs it produces flow only to that instance's exporter. Two instances with different `instrumentationOptions` therefore feed their respective exporters with different instrumentation sets. Auto-instrumentation does not depend on the ambient current instance and bypasses the delegating providers; those parents remain the routing mechanism only for manual telemetry created through the global OpenTelemetry API (`trace.getTracer(...)`), which still resolves to the ambient (via `runWithInstance`) or default instance.
A365 export & defaults. Each instance can target A365 (`A365SpanProcessor` + `Agent365Exporter`). When an instance targets A365 without Azure Monitor, the A365 GenAI-focused instrumentation defaults are applied per instance. `_applyA365InstrumentationDefaults` was moved into `src/distro/instrumentations.ts` and is now shared by both the single- and multi-instance paths (re-exported from `distro.ts` for existing consumers).
Third-party instrumentations & touching globals
Registering a third-party instrumentation. The per-instance `instrumentationOptions` covers the built-in set (HTTP, Azure SDK, DB clients, loggers, GenAI). To add an arbitrary third-party OpenTelemetry instrumentation, a customer registers it the standard OTel way and does not pass a provider. When no `tracerProvider`/`meterProvider`/`loggerProvider` is supplied, OTel binds the instrumentation to the global providers which, after `ensureGlobalSetup()`, are our delegating parent providers. So the third-party instrumentation transparently receives delegating tracers/meters/loggers.
How its telemetry is routed. The delegating provider resolves the target instance per call (never cached): the ambient instance bound by `runWithInstance`, else the default instance. So:
- Telemetry emitted inside `instance.runWithInstance(() => …)` lands on that instance's exporter.
- Outside any `runWithInstance`, it lands on the default instance.

Why "touching globals" is safe. A third-party instrumentation (or an SDK it bundles) may reach for process-global OTel state. Each case is handled:
- Setting a global provider (`trace.setGlobalTracerProvider`, `metrics.setGlobalMeterProvider`, `logs.setGlobalLoggerProvider`): the OTel API honors only the first registration and turns later calls into a no-op (logging a diag error). `ensureGlobalSetup()` installs the delegating parents once, up front, and the child providers are deliberately never set as global. A third-party instrumentation therefore cannot overwrite the delegators, hijack routing, or pin all telemetry to a single pipeline.
- Reading the global provider (`trace.getTracerProvider()`/`getTracer`): returns the delegating parent, so it inherits the per-call routing above.
- Setting the global context manager or propagator (`context.setGlobalContextManager`, `propagation.setGlobalPropagator`): also once-only, and installed once by `ensureGlobalSetup()` as a single shared `AsyncLocalStorageContextManager` + `CompositePropagator`. Context and propagation are intentionally process-wide and shared by every instance (the ambient "current instance" id itself rides on the active context), so a third-party instrumentation using `context.active()`/`.with()`/propagation interoperates correctly and can't install a competing manager.
- Patching a module (`http`, a DB driver, etc.): this is inherently process-global and independent of providers. Because the instrumentation holds a delegating tracer, the patched code still emits into whichever instance is ambient at call time. If two instances enable the same built-in instrumentation, each gets its own instrumentation object bound to its own providers; the underlying module may be wrapped by both (standard OTel behavior), but each wrapper carries its own instance-bound tracer, so spans still land on the correct instance.
- The `enableInstrumentations` patch is installed once in `ensureGlobalSetup()`, so third-party registrations are still counted.

Deterministic binding (note). Built-in per-instance instrumentations are bound directly to their instance's providers, so they need no ambient context. A globally-registered third-party instrumentation instead routes via the ambient/default instance; to pin it deterministically to one instance, run the workload it observes inside that instance's `runWithInstance(...)`. A future enhancement could accept `instrumentations?: Instrumentation[]` in the per-instance options and bind them directly, matching the built-in behavior.
Testing
`test/internal/functional/multiInstance.test.ts`:
- Global API calls made inside `runWithInstance` route to the bound instance.
- Two instances with different `instrumentationOptions` (HTTP on for A, off for B) bind different instrumentation sets to distinct providers, verifying per-exporter instrumentation.

Dependencies
Promotes `@opentelemetry/context-async-hooks` from transitive to a direct dependency (imported directly to register the context manager, since the multi-instance path does not call `NodeSDK.start()`). No `overrides` used.
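Returning to the "touching globals" argument in this description, the first-registration-wins behavior it depends on can be modeled in a few lines. This is a simplified illustration and not the actual `@opentelemetry/api` implementation; `GlobalSlot` and the string-returning `TracerProvider` interface are invented for the sketch.

```typescript
// Simplified model of a once-only global registration.
class GlobalSlot<T> {
  private current: T | undefined;

  // First registration wins; later calls return false and leave the slot untouched.
  set(value: T): boolean {
    if (this.current !== undefined) return false;
    this.current = value;
    return true;
  }

  get(): T | undefined {
    return this.current;
  }
}

// Hypothetical provider shape: a "tracer" is reduced to a descriptive string.
interface TracerProvider {
  getTracer(name: string): string;
}

const globalTracerProvider = new GlobalSlot<TracerProvider>();

const delegatingParent: TracerProvider = {
  getTracer: (name) => `delegating:${name}`,
};
const thirdPartyProvider: TracerProvider = {
  getTracer: (name) => `third-party:${name}`,
};

// Global setup runs first and installs the delegating parent.
const installedByGlobalSetup = globalTracerProvider.set(delegatingParent);
// A later attempt by a third-party library is ignored.
const installedByThirdParty = globalTracerProvider.set(thirdPartyProvider);

// Anything reading the global afterwards still gets the delegating parent.
const tracer = globalTracerProvider.get()?.getTracer("some-library");
console.log(installedByGlobalSetup, installedByThirdParty, tracer);
```

The ordering is what matters: the guarantee holds only if the delegating parents are installed before any other code attempts a global registration.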